3 research outputs found
MediViSTA-SAM: Zero-shot Medical Video Analysis with Spatio-temporal SAM Adaptation
In recent years, the Segment Anything Model (SAM) has attracted
considerable attention as a foundational model well-known for its robust
generalization capabilities across various downstream tasks. However, SAM does
not exhibit satisfactory performance in the realm of medical image analysis. In
this study, we present the first adaptation of SAM to video segmentation,
called MediViSTA-SAM, a novel approach designed for medical video segmentation.
Given video data, the MediViSTA spatio-temporal adapter captures long- and
short-range temporal attention with a cross-frame attention mechanism that
constrains each frame to use the immediately preceding video frame as a
reference, while also taking spatial information into account.
Additionally, it incorporates multi-scale fusion by employing a U-shaped
encoder and a modified mask decoder to handle objects of varying sizes. To
evaluate our approach, extensive experiments were conducted against
state-of-the-art (SOTA) methods, assessing its generalization abilities on
multi-vendor in-house echocardiography datasets. The results highlight the
accuracy and effectiveness of our network in medical video segmentation.
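The cross-frame attention idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes identity query/key/value projections, a single attention head, and that the first frame (which has no predecessor) attends to itself. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors."""
    scale = math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def cross_frame_attention(frames):
    """frames: list of frames, each a list of token vectors.

    Tokens of frame t are the queries; keys and values come from
    frame t-1. Assumption: frame 0 attends to itself.
    """
    outputs = []
    for t, frame in enumerate(frames):
        reference = frames[t - 1] if t > 0 else frame
        outputs.append(attend(frame, reference, reference))
    return outputs


# Two frames with two 2-dimensional tokens each (synthetic values).
frames = [[[1.0, 0.0], [0.0, 1.0]],
          [[0.5, 0.5], [1.0, 1.0]]]
attended = cross_frame_attention(frames)
```

A real adapter would add learned projections and sit inside SAM's encoder blocks; the sketch only shows how restricting keys and values to the preceding frame ties each frame to its temporal reference.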
MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation
The Segment Anything Model (SAM), a foundation model for general image
segmentation, has demonstrated impressive zero-shot performance across numerous
natural image segmentation tasks. However, SAM's performance significantly
declines when applied to medical images, primarily due to the substantial
disparity between natural and medical image domains. To effectively adapt SAM
to medical images, it is important to incorporate critical third-dimensional
information, i.e., volumetric or temporal knowledge, during fine-tuning.
Simultaneously, we aim to harness SAM's pre-trained weights within its original
2D backbone to the fullest extent. In this paper, we introduce a
modality-agnostic SAM adaptation framework, named MA-SAM, that is applicable
to various volumetric and video medical data. Our method is rooted in a
parameter-efficient fine-tuning strategy that updates only a small portion of
weight increments while preserving the majority of SAM's pre-trained weights.
By injecting a series of 3D adapters into the transformer blocks of the image
encoder, our method enables the pre-trained 2D backbone to extract
third-dimensional information from input data. The effectiveness of our method
has been comprehensively evaluated on four medical image segmentation tasks
using 10 public datasets across CT, MRI, and surgical video data. Remarkably,
without using any prompt, our method consistently outperforms various
state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in
Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical
scene segmentation respectively. Our model also demonstrates strong
generalization, and excels in challenging tumor segmentation when prompts are
used. Our code is available at: https://github.com/cchen-cc/MA-SAM
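The gains above are reported in Dice. As a reminder of what that metric measures, here is a minimal sketch of the Dice coefficient for binary masks. It assumes flat sequences of 0/1 labels and treats two empty masks as a perfect match. The function name is hypothetical, and the abstract does not specify the paper's exact evaluation protocol (per-class averaging, per-volume versus per-slice, and so on).

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    pred, target: equal-length flat sequences of 0/1 labels.
    Assumption: two empty masks count as a perfect match (1.0).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Synthetic masks: one overlapping pixel, 2 predicted and 1 true foreground pixel.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # 2 * 1 / (2 + 1)
```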
Three-dimensional Cardiomyocytes Structure Revealed By Diffusion Tensor Imaging and Its Validation Using a Tissue-Clearing Technique
We characterized the microstructural response of the myocardium to cardiovascular disease using diffusion tensor imaging (DTI) and performed histological validation by intact, un-sectioned, three-dimensional (3D) histology using a tissue-clearing technique. The approach was validated in normal (n = 7) and ischemic (n = 8) heart failure model mice. Whole heart fiber tracking using DTI in fixed ex-vivo mouse hearts was performed, and the hearts were processed with the tissue-clearing technique. Cardiomyocyte orientation was quantified on both DTI and 3D histology. Helix angle (HA) and global HA transmurality (HAT) were calculated, and the DTI findings were confirmed with 3D histology. Global HAT was significantly reduced in the ischemic group (DTI: 0.79 ± 0.13°/% transmural depth [TD] and 3D histology: 0.84 ± 0.26°/%TD) compared with controls (DTI: 1.31 ± 0.20°/%TD and 3D histology: 1.36 ± 0.27°/%TD, all p < 0.001). On direct comparison of DTI with 3D histology for the quantitative assessment of cardiomyocyte orientation, significant correlations were observed in both per-sample (R² = 0.803) and per-segment analyses (R² = 0.872). We demonstrated the capability and accuracy of DTI for mapping cardiomyocyte orientation by comparison with the intact 3D histology acquired by the tissue-clearing technique. DTI is a promising tool for the noninvasive characterization of cardiomyocyte architecture.
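The units of HAT above (°/%TD) suggest a slope of helix angle against transmural depth. A minimal sketch under that assumption follows: it fits a least-squares line to helix angle versus depth and returns the slope magnitude. The abstract does not describe the authors' exact computation, the function name is hypothetical, and the sample values are synthetic, not data from the study.

```python
def helix_angle_transmurality(depth_percent, helix_angle_deg):
    """Least-squares slope magnitude of helix angle vs. transmural depth.

    depth_percent: transmural depth of each sample, in % (0 to 100).
    helix_angle_deg: helix angle of each sample, in degrees.
    Assumption: HAT is reported as a positive value, so the sign of the
    slope (which depends on the angle convention) is dropped.
    """
    n = len(depth_percent)
    if n != len(helix_angle_deg) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 samples")
    mean_x = sum(depth_percent) / n
    mean_y = sum(helix_angle_deg) / n
    sxx = sum((x - mean_x) ** 2 for x in depth_percent)
    if sxx == 0:
        raise ValueError("depth values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(depth_percent, helix_angle_deg))
    return abs(sxy / sxx)


# Synthetic profile: helix angle falls linearly from +60° to -60° across the wall.
depths = [0.0, 25.0, 50.0, 75.0, 100.0]
angles = [60.0, 30.0, 0.0, -30.0, -60.0]
hat = helix_angle_transmurality(depths, angles)  # slope of -1.2, magnitude 1.2 °/%TD
```

Under this reading, a flatter helix-angle profile across the wall gives a smaller HAT, which is the direction of change reported for the ischemic group.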